ON THE TWO WEIGHT HILBERT TRANSFORM INEQUALITY



MICHAEL T. LACEY 

Abstract. Let $\sigma$ and $w$ be locally finite positive Borel measures on $\mathbb R$ which do not share a
common point mass. Assume that the pair of weights satisfy a Poisson $A_2$ condition, and satisfy
the testing conditions below, for the Hilbert transform $H$,

$\int_I H(\sigma 1_I)^2 \, dw \lesssim \sigma(I), \qquad \int_I H(w 1_I)^2 \, d\sigma \lesssim w(I),$

with constants independent of the choice of interval $I$. Then $H(\sigma \, \cdot)$ maps $L^2(\sigma)$ to $L^2(w)$,
verifying a conjecture of Nazarov-Treil-Volberg. The proof uses basic tools of non-homogeneous
analysis with two components particular to the Hilbert transform. The first is a global to local
reduction, a consequence of prior work of Lacey-Sawyer-Shen-Uriarte-Tuero. The second, an
analysis of the local part, is the contribution of this paper.

1. Introduction

Given weights (i.e. locally finite positive Borel measures) $\sigma$ and $w$ on the real line,
consider the following two weight norm inequality for the Hilbert transform,

(1.1)  $\int |H_\epsilon(f\sigma)|^2 \, w(dx) \le \mathscr N^2 \int |f|^2 \, \sigma(dx), \qquad f \in L^2(\sigma),$

where $\mathscr N$ is the best constant in the inequality, uniform over all $0 < \epsilon < 1$, which define a standard
truncation of the Hilbert transform applied to a signed locally finite measure $\nu$,

$H_\epsilon \nu(x) := \int_{\epsilon < |x-y| < \epsilon^{-1}} \frac{\nu(dy)}{y-x}.$

We insist upon this formulation as the principal value need not exist in the generality that we are
interested in. Below, however, we systematically suppress the uniformity over $\epsilon$ above, writing
just $H$ for $H_\epsilon$, with it understood that all estimates are independent of $0 < \epsilon < 1$.

A question of fundamental importance is establishing characterizations of the inequality above. 
In this paper we answer a conjecture of Nazarov-Treil-Volberg [18, 31], and sharpen a prior 
characterization of Lacey-Sawyer-Shen-Uriarte-Tuero [11].

1.2. Theorem. Let $\sigma$, $w$ be two weights which do not share a common point mass. The inequality
(1.1) holds if and only if the pair of weights $\sigma$, $w$ satisfy these inequalities uniformly over all
intervals $I$, and in their dual formulation. (The dual inequalities are obtained by interchanging
the roles of $w$ and $\sigma$.)

(1.3)  $\int_{\mathbb R} \frac{|I|}{(\operatorname{dist}(x,I) + |I|)^2} \, \sigma(dx) \cdot \frac{w(I)}{|I|} \le \mathscr A_2,$

(1.4)  $\int_I H(\sigma 1_I)^2 \, w(dx) \le \mathscr T^2 \sigma(I).$

Taking $\mathscr A_2$ and $\mathscr T$ to be the best constants of the inequalities above, there holds
$\mathscr N \simeq \mathscr A_2^{1/2} + \mathscr T$.

This is an extension of the T1 Theorem of David-Journé [5], to a setting in which the transformation
is fixed to be just a single operator, the Hilbert transform, but the weights are arbitrary.
It is a further refinement of the real-variable characterization obtained by the author with
Sawyer-Shen-Uriarte-Tuero [11]. Indeed, the latter was the first such theorem for a continuous singular
integral in the non-homogeneous setting pioneered in creative work of Nazarov-Treil-Volberg
[15-18].

Note that the first condition is an extension of the typical A2 condition to a 'half-Poisson' 
setting, which is known to be necessary. The second condition (1.4) is called an 'interval testing 
condition,' and is obviously necessary. Thus, the content of the Theorem is the sufficiency of the
$A_2$ and testing conditions for the norm inequality.

If the pair of measures share a common point mass, $\sigma(\{x\}) \cdot w(\{x\}) > 0$ for some $x \in \mathbb R$, then
the $A_2$ condition is trivially false. In this case, one must change this condition, but then also
check that the entire proof goes through, namely that there holds the energy inequality of [8], 
the functional energy inequality of [11], and that the argument of this paper goes through. This 
will be investigated elsewhere. 

The study of two weight inequalities for individual operators was initiated by Eric Sawyer, with 
the maximal function [28], and later the fractional integrals [29]. Indeed, the latter characterization 
was precisely in the T1 language, and importantly for this paper, that paper also proved the two 
weight inequality for the Poisson integral, a fact used in [11, §7], delivering an inequality that is 
fundamental to the proof of the theorem above. The similarities between Sawyer's results and the 
T1 theorem were brought to the fore with the work of Nazarov-Treil-Volberg on non-homogeneous
harmonic analysis already cited. 

The question of the two weight inequality was raised by Muckenhoupt and Wheeden [13] in 
1976. Sarason recognized the same problem in his deep work [27], in which he constructed 
examples of individually unbounded Toeplitz operators whose composition was bounded. In [26],
Sarason raised the question: 'Characterize those pairs of outer functions $g, h$ in $H^2$ of the unit
disk such that the operator $T_g T_{\bar h}$ is bounded on $H^2$.' This is equivalent to the boundedness of the
Riesz projection from $L^2(|g|^{-2})$ to $H^2(|f|^2)$, see [4, §5], and from here, the two weight question
for the Hilbert transform gained wider attention. This same note of Sarason sketches a complex- 
variable argument of Treil proving the necessity of the A2 condition.^1 Sarason wrote that it was
'tempting to conjecture' that the Poisson A2 condition was sufficient for the norm inequality. 
That the simple A2 condition is not sufficient is straightforward [13], but that the full Poisson
A2 condition is not sufficient is a deeper fact, proved by Nazarov [14]. This was an important 
step in formulating the conjecture solved in this paper. 

The two weight inequality is also essentially equivalent to the question of characterizing those
measures $\mu$ on the unit disk for which a model space $K_\theta$, $\theta$ inner, embeds into $L^2(\mathbb D, \mu)$. A
question arose there, that could be understood as asking if the Poisson A2 condition, and just
one set of testing inequalities could be sufficient for the two weight inequality. This was disproved 
by Nazarov-Volberg [20]. The theory of Clark measures is highly relevant here, and in particular, 
there is no restriction on the class of measures that can arise as Clark measures, forcing one to 
consider arbitrary weights in this context. See [24] for more information. The Theorem of this 
paper also has applications to the theory of de Branges spaces [1], and to the spectral theory of 
(rank one perturbations of) normal operators [21]. 

Sufficient conditions for the boundedness of the composition of Toeplitz operators were given 
by Dechao Zheng [32].^2 This particular direction leads to the so-called 'bump conditions', which
remain an active research direction, with a somewhat different focus than ours.

In 2005, Nazarov-Treil-Volberg [18] created a method to prove two weight inequalities for
Calderón-Zygmund operators for general measures, a component of their program of developing
a non-homogeneous harmonic analysis. This innovative approach, incorporating the fundamental
technique of random dyadic grids, and weight adapted martingale transform methods that have
been integral to all subsequent approaches to this question, was strong enough to prove a certain
variant of our main theorem for the triple of operators $H_\sigma$, $M_\sigma$ and $M_w$, where $M$ is the maximal
function, see [18, Thm. 2.1]. This result was a notable success. Importantly, the argument
proceeded by assuming that the pair of weights satisfied a supplemental pair of conditions, the
so-called pivotal conditions, [18, §7.2]. This method of proof also had interesting implications
[22] for the so-called A2 theorem, solved by Hytönen [7].

The individual operator two weight problem is of course highly specific to the operator in
question; in particular, there is no implication between the maximal function and the Hilbert
transform in this context. Likewise, there are no weak-type estimates, Calderón-Zygmund
decomposition, or interpolation available. While the two weight theory is largely complete for positive
operators, the non-positive case is much harder. For certain kinds of dyadic operators, positive or
non-positive, there is an elegant characterization in [19]. And, for well-behaved measures, one can
frequently reduce Calderón-Zygmund operators to dyadic ones. For the Hilbert transform, this
can be done with the remarkable observation of Petermichl [23]; an extension of this to arbitrary
Calderón-Zygmund operators is one of the important observations of Hytönen [7].

^1 Muckenhoupt-Wheeden used real-variable methods to show the necessity of the half-Poisson A2 condition.
There is a complex-variable proof of the necessity of the full Poisson A2 condition in [18, §3]. Also, the paper
[13] includes results and conjectures in the $L^p$ setting.

^2 In the language of this paper, the assumption is that $w$ and $\sigma$ have densities, and the Poisson A2 condition
is assumed, with a power bigger than one imposed on the densities. See [32, §6].

These dyadic methods obscure an essential aspect of the Hilbert transform, the last remnant
of positivity: The kernel $\frac{1}{y-x}$ has a positive derivative in $x$. This is the main observation in the
proof of necessity of the energy inequality (2.4), an essential strengthening of the
Nazarov-Treil-Volberg pivotal conditions [18, §7.3]. The energy inequality follows from the $A_2$ and testing
inequalities. That is, it and its further extensions are free to be used in the proof of the main
theorem. Indeed, they must be used. The apparent difficulty in deriving the energy inequality
from the dyadic models of the Hilbert transform is an important obstacle to dyadic approaches
to our result.

The paper [8], by the author, Sawyer and Uriarte-Tuero, proves the energy inequality. Using
conditions which in a certain sense interpolate between the energy and pivotal conditions, sharper
sufficient conditions were given for the two weight inequality. These sharper conditions permit
one to construct an example [8, §7] of a pair of weights which satisfy the two weight norm
inequality, but fail the pivotal condition. In a certain sense, it is the best example known, in that
simple modifications give alternate derivations of other counterexamples, including the examples
of [14,20], as well as the example in [25].^3

Still this example gives only the barest hints of the inherent difficulties in the proof of our
main theorem. The paper [10] introduced the natural Calderón-Zygmund stopping data into the
subject, essential to the subsequent developments. An important breakthrough came in a
previous paper of the author with Sawyer-Shen-Uriarte-Tuero [11], which obtained an unconditional
characterization of the two weight inequality. The characterization is in terms of the $A_2$ condition as
in (1.3), but the testing conditions (1.4) were strengthened to $\int_I H(\sigma 1_E)^2 \, dw \lesssim \sigma(I)$ for all
intervals $I$ and all Borel measurable subsets $E \subset I$, as well as the dual condition holding. There
was no prior real-variable characterization known, whereas a complex-variable characterization, a
variant of the Helson-Szegő theorem, was established by Cotlar-Sadosky [3].

The proof of the main theorem, as mentioned, uses the random grids and weight adapted
martingale differences that are basic to the non-homogeneous theory. Then, aside from more
routine considerations that are common to many proofs of T1 type theorems, the proof naturally
splits into two parts. The first part is the reduction of the global inequality to one of a local
nature. The essence of this part of the proof was found in a prior work of the author with
Sawyer-Shen-Uriarte-Tuero [11], which we recall in §3. This part depends upon a subtle multiscale
extension of the energy inequality, one that itself is close to being stated in intrinsic form. It is 
proved in [11, §7] by appealing to Sawyer's two weight inequality for the Poisson integral [29]. 

After that, there is the control of the local part, which is largely contained in §4, a section
devoted to the analysis of the so-called stopping form, with a highly non-intrinsic formulation. The
stopping form is familiar to experts in the T1 theorem, but in all other settings, it is essentially an
error term, expediently handled by some standard off-diagonal estimates. Any of these classical
lines of reasoning will fail in the current setting. Instead, we construct a proof with a subtle
recursion, one analogous to proofs of the Carleson theorem on the pointwise convergence of
Fourier series [2,6,12]. It is the main novelty of this paper.

^3 There is a pair of weights $\sigma$, $w$ with $M_\sigma$ and $M_w$ bounded, but $H_\sigma$ not bounded.

It is a pleasure to acknowledge the many conversations about this question that I have had
with Ignacio Uriarte-Tuero, Eric Sawyer, and Chun-Yen Shen.



2. Preliminaries

2.1. Dyadic Grids. A collection of intervals $\mathcal G$ is a grid if for all $G, G' \in \mathcal G$, we have
$G \cap G' \in \{\emptyset, G, G'\}$. By a dyadic grid we mean a grid $\mathcal D$ of intervals of $\mathbb R$ such that for each
interval $I \in \mathcal D$, the subcollection $\{I' \in \mathcal D : |I'| = |I|\}$ partitions $\mathbb R$, aside from endpoints of the
intervals. In addition, the left and right halves of $I$, denoted by $I_\pm$, are also in $\mathcal D$.

For $I \in \mathcal D$, the left and right halves $I_\pm$ are referred to as the children of $I$. We denote by
$\pi_{\mathcal D}(I)$ the unique interval in $\mathcal D$ having $I$ as a child, and we refer to $\pi_{\mathcal D}(I)$ as the
$\mathcal D$-parent of $I$.

We will work with subsets $\mathcal F \subset \mathcal D$. We say that $I$ has $\mathcal F$-parent $\pi_{\mathcal F} I = F$ if $F \in \mathcal F$ is the
minimal element of $\mathcal F$ that contains $I$. We also set $\pi^1_{\mathcal F} I := \pi_{\mathcal F} I$, and inductively set
$\pi^{t+1}_{\mathcal F} I$ to be the minimal element of $\mathcal F$ that strictly contains $\pi^t_{\mathcal F} I$. The $\mathcal F$-children of
$F \in \mathcal F$ are the maximal $F' \in \mathcal F$ which are strictly contained in $F$.

2.2. Haar Functions. Let $\sigma$ be a weight on $\mathbb R$, one that does not assign positive mass to any
endpoint of a dyadic grid $\mathcal D$. If $I \in \mathcal D$ is such that $\sigma$ assigns non-zero weight to both children of $I$,
the associated Haar function is chosen to have a non-negative inner product with the independent
variable, $\langle x, h^\sigma_I(x) \rangle_\sigma \ge 0$, a convenient choice due to the central role of the energy inequality
(2.4). Explicitly,

(2.1)  $h^\sigma_I := \sqrt{\frac{\sigma(I_-)\sigma(I_+)}{\sigma(I)}} \Bigl( \frac{I_+}{\sigma(I_+)} - \frac{I_-}{\sigma(I_-)} \Bigr).$

In this definition, we are identifying an interval with its indicator function, and we will do so
throughout the remainder of the paper. This is an $L^2(\sigma)$-normalized function, and has $\sigma$-integral
zero. For any dyadic interval $I_0$, it holds that $\{\sigma(I_0)^{-1/2} I_0\} \cup \{h^\sigma_I : I \in \mathcal D, I \subset I_0\}$ is an
orthonormal basis for $L^2(I_0, \sigma)$. We will use the notation $L^2_0(I_0, \sigma)$ for the subspace of $L^2(I_0, \sigma)$
of functions with mean zero. It has orthonormal basis $\{h^\sigma_I : I \in \mathcal D, I \subset I_0\}$.

We will use the notations $\hat f(I) = \langle f, h^\sigma_I \rangle_\sigma$, as well as

$\Delta^\sigma_I f = \langle f, h^\sigma_I \rangle_\sigma h^\sigma_I = I_+ \mathbb E^\sigma_{I_+} f + I_- \mathbb E^\sigma_{I_-} f - I \, \mathbb E^\sigma_I f.$

Here $\mathbb E^\sigma_I f := \sigma(I)^{-1} \int_I f \, d\sigma$ is the $\sigma$-average of $f$ over $I$. The second equality is the familiar
martingale difference equality, and so we will refer to $\Delta^\sigma_I f$ as a martingale difference. It implies
the familiar telescoping identity: for $f \in L^2_0(I_0, \sigma)$ and $J \subsetneq I_0$,
$\mathbb E^\sigma_J f = \sum_{I : J \subsetneq I \subset I_0} \mathbb E^\sigma_J \Delta^\sigma_I f$.
The Haar support of a function $f \in L^2(\sigma)$ is the collection $\{I : \hat f(I) \ne 0\}$.

2.3. Good-Bad Decomposition. Since the works of Nazarov-Treil-Volberg [15-17], the use
of random dyadic grids has been a foundational technique in settings in which the measures are
non-doubling. Our use of them employs only standard and well-known facts.
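As a check on the Haar functions of §2.2, the short computation below verifies the three stated properties: $\sigma$-integral zero, unit $L^2(\sigma)$ norm, and non-negative pairing with $x$. It is a sketch that assumes the standard closed form $h^\sigma_I = \sqrt{\sigma(I_-)\sigma(I_+)/\sigma(I)}\,\bigl(I_+/\sigma(I_+) - I_-/\sigma(I_-)\bigr)$; the shorthand $a_\pm$ is introduced here only for illustration.

```latex
% Shorthand (for this sketch only): a_\pm := \sigma(I_\pm), so that \sigma(I) = a_- + a_+.
\begin{align*}
\int h^\sigma_I \, d\sigma
  &= \sqrt{\tfrac{a_- a_+}{a_- + a_+}} \Bigl( \tfrac{a_+}{a_+} - \tfrac{a_-}{a_-} \Bigr) = 0, \\
\| h^\sigma_I \|_\sigma^2
  &= \tfrac{a_- a_+}{a_- + a_+} \Bigl( \tfrac{1}{a_+} + \tfrac{1}{a_-} \Bigr) = 1, \\
\langle x, h^\sigma_I \rangle_\sigma
  &= \sqrt{\tfrac{a_- a_+}{a_- + a_+}} \,
     \bigl( \mathbb E^\sigma_{I_+} x - \mathbb E^\sigma_{I_-} x \bigr) \ge 0 .
\end{align*}
% The last inequality holds because every point of I_+ lies to the right of every point of I_-.
```

The sign convention is what makes the pairing $\langle x, h^\sigma_I \rangle_\sigma$ a non-negative quantity, which is how it enters the energy.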
With a choice of dyadic grid $\mathcal D$ understood, we say that $J \in \mathcal D$ is $(\epsilon, r)$-good if and only if for
all intervals $I \in \mathcal D$ with $|I| \ge 2^{r-1}|J|$, the distance from $J$ to the boundary of either child of $I$ is
at least $|J|^{\epsilon}|I|^{1-\epsilon}$.

For $f \in L^2(\sigma)$ we set $P^\sigma_{good} f = \sum_{I \in \mathcal D : I \text{ is } (\epsilon, r)\text{-good}} \Delta^\sigma_I f$. The projection
$P^w_{good} g$ is defined similarly.

With $0 < \epsilon < 1$ fixed, one will need to take $r > \epsilon^{-1}$, and any sufficiently large finite value suffices.
The property of intervals being $(\epsilon, r)$-good is essential, and highlighted when the property is being
used.
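One way to see why $r$ must be large relative to $\epsilon^{-1}$ is the following heuristic. It is a sketch, not the argument of [18], and it assumes the goodness threshold is $|J|^{\epsilon}|I|^{1-\epsilon}$ with $J \subset I$. The interval $I'$ is notation introduced only here.

```latex
% Sketch: J \subset I, and I' denotes the child of I containing J, so |I'| = |I|/2.
% The two gaps between J and the endpoints of I' sum to |I'| - |J|, hence
\[
\operatorname{dist}(J, \partial I') \le \tfrac12 \bigl( |I'| - |J| \bigr) < \tfrac14 |I| .
\]
% Goodness asks for dist(J, \partial I') \ge |J|^{\epsilon} |I|^{1-\epsilon}, which therefore forces
\[
|J|^{\epsilon} |I|^{1-\epsilon} < \tfrac14 |I|
\quad\Longleftrightarrow\quad
\Bigl( \tfrac{|I|}{|J|} \Bigr)^{\epsilon} > 4
\quad\Longleftrightarrow\quad
\log_2 \tfrac{|I|}{|J|} > \tfrac{2}{\epsilon} .
\]
```

So the condition can only be met at scale separations exceeding $2/\epsilon$. If it were imposed at smaller separations, no interval could be good.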

2.4. Energy, Monotonicity, and Poisson. We collect results specific to the Hilbert transform;
see [8, §2] and [11, §5] for more details. Throughout the paper, we use this definition of the
Poisson integral of weight $\sigma$ over interval $I$.

$P(\sigma, I) := \int_{\mathbb R} \frac{|I|}{(\operatorname{dist}(x,I) + |I|)^2} \, \sigma(dx).$

Frequently, $\sigma$ has a further restriction on its support, clearly indicated in the notation.

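2.2. Lemma (Monotonicity Property). Suppose that $\nu$ is a signed measure, and $\mu$ is a positive
measure with $\mu \ge |\nu|$, both supported outside an interval $I \in \mathcal D$. Then, for good $J \subset I$ and
$2^{r+1}|J| \le |I|$, and function $g \in L^2(J, w)$, it holds that

(2.3)  $|\langle H\nu, g \rangle_w| \le \langle H\mu, \bar g \rangle_w \approx P(\mu, J) \bigl\langle \tfrac{x}{|J|}, \bar g \bigr\rangle_w.$

Here, $\bar g = \sum_{J'} |\hat g(J')| h^w_{J'}$ is a Haar multiplier applied to $g$.

The concept of energy is fundamental to the subject. For interval $I$, define

$E(w, I)^2 := \mathbb E^{w(dx)}_I \mathbb E^{w(dx')}_I \frac{(x - x')^2}{|I|^2} = \frac{2}{w(I)|I|^2} \sum_{J : J \subset I} \langle x, h^w_J \rangle_w^2.$

Now, consider the energy constant, the smallest constant $\mathscr E$ such that this condition holds, as
presented or in its dual formulation. For all intervals $I_0$, all partitions $\mathcal P$ of $I_0$, it holds that

(2.4)  $\sum_{I \in \mathcal P} P(\sigma I_0, I)^2 E(w, I)^2 w(I) \le \mathscr E^2 \sigma(I_0).$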

2.5. Lemma. [Energy Inequality] There holds $\mathscr E \lesssim \mathscr A_2^{1/2} + \mathscr T = \mathscr H$.
For a proof, see [8, Proposition 2.11].
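The two expressions commonly used for the energy, a double average of $(x-x')^2$ and a sum of squared Haar coefficients of $x$, are linked by a variance identity. The sketch below assumes the definition $E(w,I)^2 = \mathbb E^{w(dx)}_I \mathbb E^{w(dx')}_I (x-x')^2/|I|^2$ for a dyadic interval $I$; the symbol $m$ is introduced only for this computation.

```latex
% Notation for this sketch: m := \mathbb E^w_I x, the w-average of x over I.
\begin{align*}
\mathbb E^{w(dx)}_I \mathbb E^{w(dx')}_I (x - x')^2
  &= 2\, \mathbb E^w_I x^2 - 2 m^2
   = \frac{2}{w(I)} \int_I (x - m)^2 \, w(dx) \\
  &= \frac{2}{w(I)} \sum_{J : J \subset I} \langle x, h^w_J \rangle_w^2 .
\end{align*}
% Last step: Parseval in L^2_0(I, w) applied to the mean-zero function (x - m) 1_I,
% using \langle x - m, h^w_J \rangle_w = \langle x, h^w_J \rangle_w since each h^w_J has w-integral zero.
```

Dividing by $|I|^2$ gives the Haar-coefficient form of $E(w,I)^2$. In particular $E(w,I) \le 1$, since $|x - x'| \le |I|$ on $I \times I$.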

2.6. Remark. One should note that for interval $J \subset I_0$, and interval $J' \subset J$, there holds

$\langle H_\sigma(I_0 - J), h^w_{J'} \rangle_w \gtrsim P(\sigma(I_0 - J), J') \bigl\langle \tfrac{x}{|J'|}, h^w_{J'} \bigr\rangle_w.$

And, while the inequality is strict in general, we can reverse it if $J'$ is good and $J' \Subset J$. This
distinction is basic to the subject, and drives some of the case analysis in the proof of Lemma 4.6.

2.7. Remark. The influential pivotal condition of Nazarov-Treil-Volberg [18] is obtained from the
energy condition by setting $E(w, I) = 1$. Namely, it is the assumption on the pair of weights that
there is a finite positive constant $\mathscr P$ such that for all intervals $I_0$, all partitions $\mathcal P$ of $I_0$, it holds
that

$\sum_{I \in \mathcal P} P(\sigma I_0, I)^2 w(I) \le \mathscr P^2 \sigma(I_0).$

And, the dual inequalities also hold. Note that $P(\sigma I_0, I) \lesssim \inf_{x \in I} M(\sigma I_0)(x)$, so that boundedness
of the maximal functions $M_\sigma : L^2(\sigma) \to L^2(w)$ and $M_w : L^2(w) \to L^2(\sigma)$ imply the pivotal
condition. Though they wrote that the pivotal conditions 'might turn out to be necessary', this
was disproved in [8].
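The comparison between the Poisson integral and the maximal function can be made explicit by a dyadic annulus decomposition. The following is a sketch under stated assumptions: $M$ is taken to be the centered maximal function of a measure, $P(\mu, I) = \int |I| (\operatorname{dist}(y,I)+|I|)^{-2} \mu(dy)$, and the sets $A_k$ and the constant $20$ are artifacts of this particular splitting, not values from the paper.

```latex
% Assumptions for this sketch: \mu := \sigma I_0 and M\mu(x) := \sup_{r>0} \mu((x-r,x+r))/(2r).
% Fix x \in I. Let A_0 := \{ y : dist(y,I) < |I| \} and, for k \ge 1,
% A_k := \{ y : 2^{k-1}|I| \le dist(y,I) < 2^k |I| \}.
% For y \in A_k: |y - x| \le dist(y,I) + |I| < 2^{k+1}|I|, so \mu(A_k) \le 2^{k+2} |I| \, M\mu(x).
\begin{align*}
P(\mu, I)
  &= \sum_{k \ge 0} \int_{A_k} \frac{|I|}{(\operatorname{dist}(y,I) + |I|)^2} \, \mu(dy) \\
  &\le \frac{\mu(A_0)}{|I|} + \sum_{k \ge 1} \frac{2^{2-2k}}{|I|} \, \mu(A_k) \\
  &\le \Bigl( 4 + \sum_{k \ge 1} 2^{4-k} \Bigr) M\mu(x) = 20 \, M\mu(x) .
\end{align*}
```

Since $x \in I$ was arbitrary, $P(\sigma I_0, I) \le 20 \inf_{x \in I} M(\sigma I_0)(x)$. Squaring and integrating against $w$ over the intervals of a partition reduces the pivotal sum to $\| M(\sigma I_0) \|_{L^2(w)}^2$.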

3. The Global to Local Reduction 

The first half of the proof follows from the techniques of [11], though that paper does not prove 
a result in the form that we need it. The goal is the reduction to the local estimate, (3.14), at 
the end of this section. 

Our aim is to prove

(3.1)  $|\langle H_\sigma f, g \rangle_w| \lesssim \mathscr H \|f\|_\sigma \|g\|_w,$

where here and throughout $\mathscr H := \mathscr A_2^{1/2} + \mathscr T$. And, as methods are of necessity focused on $L^2$, we
systematically abbreviate $\|f\|_{L^2(\sigma)}$ to $\|f\|_\sigma$.

The functions $f \in L^2(\sigma)$ and $g \in L^2(w)$ are expanded with respect to the Haar basis with
respect to a fixed dyadic grid $\mathcal D$, and adapted to the weight in question.

Randomized dyadic grids allow the extraordinarily useful reduction in the next Lemma.
This is a well-known reduction, due to Nazarov-Treil-Volberg, explained in full
detail in the current setting in [18, §4].

3.2. Lemma. For all sufficiently small $\epsilon$, and sufficiently large $r$, this holds. Suppose that for any
dyadic grid $\mathcal D$, such that no endpoint of an interval $I \in \mathcal D$ is a point mass for $\sigma$ or $w$,^4 there holds

$|\langle H_\sigma P^\sigma_{good} f, P^w_{good} g \rangle_w| \lesssim \mathscr H \|f\|_\sigma \|g\|_w.$

Then, the same inequality holds without the projections $P^\sigma_{good}$ and $P^w_{good}$, namely (3.1) holds.

That is, the bilinear form only needs to be controlled for $(\epsilon, r)$-good functions $f$ and $g$, goodness
being defined with respect to a fixed dyadic grid. Suppressing the notation, we write 'good' for
'$(\epsilon, r)$-good,' and it is always assumed that the dyadic grid $\mathcal D$ is fixed, and only good intervals
are in the Haar support of $f$ and $g$, though this is also suppressed in the notation. We clearly remark
on goodness when the property is used.

It is sufficient to assume that $f$ and $g$ are supported on an interval $I_0$; by trivial use of the
interval testing condition, we can further assume that $f$ and $g$ are of integral zero in their respective
spaces. Thus, $f$ is in the linear span of (good) Haar functions $h^\sigma_I$ for $I \subset I_0$, and similarly for $g$,
and

$\langle H_\sigma f, g \rangle_w = \sum_{I, J : I, J \subset I_0} \langle H_\sigma \Delta^\sigma_I f, \Delta^w_J g \rangle_w.$

^4 The set of dyadic grids that fail this condition has probability zero in standard constructions of the random
dyadic grids.

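The double sum is broken into different summands. Many of the resulting cases are elementary,
and we summarize these estimates as follows. Define the bilinear form

$B^{\mathrm{above}}(f, g) := \sum_{I : I \subset I_0} \sum_{J : J \Subset I} \mathbb E^\sigma_{I_J} \Delta^\sigma_I f \cdot \langle H_\sigma I_J, \Delta^w_J g \rangle_w,$

where here and throughout, $J \Subset I$ means $J \subset I$ and $2^{r+1}|J| \le |I|$. In addition, the argument of
the Hilbert transform, $I_J$, is the child of $I$ that contains $J$, so that $\Delta^\sigma_I f$ is constant on $I_J$, and
$\mathbb E^\sigma_{I_J} \Delta^\sigma_I f = \mathbb E^\sigma_J \Delta^\sigma_I f$. Define $B^{\mathrm{below}}(f, g)$ in the dual fashion.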

3.3. Lemma. There holds

$|\langle H_\sigma f, g \rangle_w - B^{\mathrm{above}}(f, g) - B^{\mathrm{below}}(f, g)| \lesssim \mathscr H \|f\|_\sigma \|g\|_w.$

This is a common reduction in a proof of a T1 theorem, and in the current context, it only
requires goodness of intervals and the $A_2$ condition. For a proof, one can consult [18,31]. The
Lemma is specifically phrased and proved in this way in [10, §8].

Thus, the main technical result is as below; it immediately supplies our main theorem. 

3.4. Theorem. There holds

$|B^{\mathrm{above}}(f, g)| \lesssim \mathscr H \|f\|_\sigma \|g\|_w.$

The same inequality holds for the dual form $B^{\mathrm{below}}(f, g)$.

3.5. Remark. We emphasize that partial information about $B^{\mathrm{above}}(f, g)$ may not yield any
substantive information about the bilinear form $\langle H_\sigma f, g \rangle_w$.^5 Nevertheless, [11] proved a characterization
of the two weight inequality using a different tool, the so-called parallel corona.

In the remainder of this section, we recall techniques from [11] that permit reduction of the 
global Theorem 3.4 to a localized setting in which the function f is more structured in that it has 
bounded averages on a fixed interval, and the pair of functions $f$, $g$ are more structured in that
their Haar supports avoid intervals that strongly violate the energy inequality. 

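3.6. Definition. Given any interval $I_0$, define $\mathcal F_{energy}(I_0)$ to be the maximal subintervals $I \subsetneq I_0$
such that

$P(\sigma I_0, I)^2 E(w, I)^2 w(I) > 10 \, \mathscr E^2 \sigma(I).$

Here, $\mathscr E$ is the constant in (2.4), and it holds that $\mathscr E \lesssim \mathscr H$. There holds
$\sigma(\cup\{F : F \in \mathcal F_{energy}(I_0)\}) \le \frac{1}{10} \sigma(I_0)$, by the energy inequality.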
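The bound on the total $\sigma$-measure of the energy stopping intervals is a one-line consequence of the energy inequality. The sketch below assumes the stopping threshold $10\,\mathscr E^2 \sigma(F)$ and uses (2.4) for a pairwise disjoint family of subintervals of $I_0$. That sub-partition reading of (2.4) is an assumption of the sketch, harmless because the summands are non-negative.

```latex
% The intervals F \in \mathcal F_{energy}(I_0) are maximal, hence pairwise disjoint.
% First inequality: the defining property of each F. Second: the energy inequality (2.4).
\begin{align*}
10 \, \mathscr E^2 \sum_{F \in \mathcal F_{energy}(I_0)} \sigma(F)
  &\le \sum_{F \in \mathcal F_{energy}(I_0)} P(\sigma I_0, F)^2 E(w, F)^2 w(F)
   \le \mathscr E^2 \sigma(I_0) .
\end{align*}
% Dividing by 10 \mathscr E^2 gives \sum_F \sigma(F) \le \sigma(I_0)/10.
```

This gain of a factor $1/10$ at each generation is the input to the Carleson property of the stopping intervals.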



^5 To illustrate, using the techniques of [11], one can show that the norm of the bilinear form $B^{\mathrm{above}}(f, g)$ is
dominated by $\mathscr H + \mathscr B_\infty$, where the latter constant is the best constant in
$|B^{\mathrm{above}}(f, g)| \le \mathscr B_\infty \sigma(I_0)^{1/2} \|g\|_w$, where $|f| \le I_0$. This highly non-trivial fact has no
implication for the full bilinear form $\langle H_\sigma f, g \rangle_w$.

We make the following construction for an $f \in L^2_0(I_0, \sigma)$, the subspace of $L^2(I_0, \sigma)$ of functions
of mean zero. Add $I_0$ to $\mathcal F$, and set $\alpha_f(I_0) := \mathbb E^\sigma_{I_0}|f|$. In the inductive stage, if $F \in \mathcal F$ is minimal,
add to $\mathcal F$ those maximal descendants $F'$ of $F$ such that $F' \in \mathcal F_{energy}(F)$ or $\mathbb E^\sigma_{F'}|f| \ge 10 \alpha_f(F)$. Then
define

$\alpha_f(F') := \begin{cases} \alpha_f(F) & \mathbb E^\sigma_{F'}|f| < 10 \alpha_f(F) \\ \mathbb E^\sigma_{F'}|f| & \text{otherwise} \end{cases}$

If there are no such intervals $F'$, the construction stops. We refer to $\mathcal F$ and $\alpha_f(\cdot)$ as
Calderón-Zygmund stopping data for $f$, following the terminology of [10, Def 3.5], [11, Def 3.4]. Their key
properties are collected here.

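3.7. Lemma. For $\mathcal F$ and $\alpha_f(\cdot)$ as defined above, there holds

(1) $I_0$ is the maximal element of $\mathcal F$.

(2) For all $I \in \mathcal D$, $I \subset I_0$, we have $\mathbb E^\sigma_I |f| \le 10 \alpha_f(\pi_{\mathcal F} I)$.

(3) $\alpha_f$ is monotonic: If $F, F' \in \mathcal F$ and $F \subset F'$ then $\alpha_f(F) \ge \alpha_f(F')$.

(4) The collection $\mathcal F$ is $\sigma$-Carleson in that

(3.8)  $\sum_{F \in \mathcal F : F \subset S} \sigma(F) \le 2 \sigma(S), \qquad S \in \mathcal D.$

(5) We have the inequality

(3.9)  $\Bigl\| \sum_{F \in \mathcal F} \alpha_f(F) \cdot F \Bigr\|_\sigma \lesssim \|f\|_\sigma.$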



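Proof. The first three properties are immediate from the construction. The fourth, the $\sigma$-Carleson
property, is seen this way. It suffices to check the property for $S \in \mathcal F$. Now, the $\mathcal F$-children can
be in $\mathcal F_{energy}(S)$, which satisfy

$\sum_{F' \in \mathcal F_{energy}(S)} \sigma(F') \le \frac{1}{10} \sigma(S).$

Or, they satisfy $\mathbb E^\sigma_{F'}|f| \ge 10 \, \mathbb E^\sigma_S |f|$, but these intervals satisfy the same estimate. Hence, (3.8)
holds.

For the final property, let $\mathcal G \subset \mathcal F$ be the subset at which the stopping values change: If
$F \in \mathcal F - \mathcal G$, and $G$ is the $\mathcal G$-parent of $F$, then $\alpha_f(F) = \alpha_f(G)$. Set

$\Phi_G := \sum_{F \in \mathcal F : \pi_{\mathcal G} F = G} F.$

Define $G_k := \{\Phi_G \ge 2^k\}$, for $k = 0, 1, \ldots$. The $\sigma$-Carleson property implies integrability of all
orders in $\sigma$-measure of $\Phi_G$. Using the third moment, we have $\sigma(G_k) \lesssim 2^{-3k} \sigma(G)$. Then, estimate

$\Bigl\| \sum_{F \in \mathcal F} \alpha_f(F) \cdot F \Bigr\|_\sigma^2 = \Bigl\| \sum_{G \in \mathcal G} \alpha_f(G) \Phi_G \Bigr\|_\sigma^2$
$\qquad \le \Bigl\| \sum_{k=0}^{\infty} 2^{k+1} \sum_{G \in \mathcal G} \alpha_f(G) 1_{G_k} \Bigr\|_\sigma^2$
$\qquad \overset{*}{\lesssim} \sum_{k=0}^{\infty} (k+1)^2 2^{2k} \Bigl\| \sum_{G \in \mathcal G} \alpha_f(G) 1_{G_k} \Bigr\|_\sigma^2$
$\qquad \overset{**}{\lesssim} \sum_{k=0}^{\infty} (k+1)^2 2^{2k} \sum_{G \in \mathcal G} \alpha_f(G)^2 \sigma(G_k)$
$\qquad \lesssim \sum_{G \in \mathcal G} \alpha_f(G)^2 \sigma(G) \lesssim \|Mf\|_\sigma^2 \lesssim \|f\|_\sigma^2.$

Note that we have used Cauchy-Schwarz in $k$ at the step marked by an *. In the step marked
with **, for each point $x$, the non-zero summands are a (super)-geometric sequence of scalars,
so the square can be moved inside the sum. Finally, we use the estimate on the $\sigma$-measure of $G_k$,
and compare to the maximal function $Mf$ to complete the estimate.

□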

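We will use the notation

$P^\sigma_F f := \sum_{I \in \mathcal D : \pi_{\mathcal F} I = F} \Delta^\sigma_I f,$

and similarly for $Q^w_F$. The inequality (3.9) allows us to estimate

(3.10)  $\sum_{F \in \mathcal F} \{\alpha_f(F) \sigma(F)^{1/2} + \|P^\sigma_F f\|_\sigma\} \|Q^w_F g\|_w \lesssim \Bigl[ \sum_{F \in \mathcal F} \{\alpha_f(F)^2 \sigma(F) + \|P^\sigma_F f\|_\sigma^2\} \times \sum_{F \in \mathcal F} \|Q^w_F g\|_w^2 \Bigr]^{1/2} \lesssim \|f\|_\sigma \|g\|_w.$

We will refer to this as the quasi-orthogonality argument. It is very useful.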

The Theorem below is the essence of the reduction from a global to local estimate in our proof.

3.11. Theorem. [Global to Local Reduction] There holds

$|B^{\mathrm{above}}(f, g) - B^{\mathrm{above}}_{\mathcal F}(f, g)| \lesssim \mathscr H \|f\|_\sigma \|g\|_w,$

where $B^{\mathrm{above}}_{\mathcal F}(f, g) := \sum_{F \in \mathcal F} B^{\mathrm{above}}(P^\sigma_F f, Q^w_F g)$.

A reduction of this type is a familiar aspect of many proofs of a T1 theorem, proved by exploiting
standard off-diagonal estimates for Calderón-Zygmund kernels. It is one of the contributions of
[18] to point out that such arguments are far more sophisticated in the two weight setting, and
[11] showed that, with Calderón-Zygmund stopping data, the reduction can be made assuming
the $A_2$ and testing hypotheses.

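Proof. This is an immediate corollary to [11, Theorem 6.6], but we include enough detail here so
that the reader need only examine the proof of the functional energy inequality in [11, §7]. The
latter inequality is a sophisticated extension of the energy inequality, one of the key innovations
of that paper. Its proof is self-contained, except for the two weight inequality for the Poisson
integral, proved by Sawyer [29, Thm 2].^6 We have

$B^{\mathrm{above}}(f, g) = \sum_{F' \in \mathcal F} \sum_{F \in \mathcal F : F \subset F'} B^{\mathrm{above}}(P^\sigma_{F'} f, Q^w_F g).$

The form $B^{\mathrm{above}}_{\mathcal F}(f, g)$ is the case of $F = F'$ in the double sum above, hence we should bound

(3.12)  $\sum_{F \in \mathcal F} B^{\mathrm{above}}(\Phi_F, Q^w_F g) = \sum_{F \in \mathcal F} \langle H_\sigma \Phi_F, Q^w_F g \rangle_w, \qquad \text{where } \Phi_F := \sum_{F' \in \mathcal F : F' \supsetneq F} P^\sigma_{F'} f.$

In the language of [11, Def 7.1], the sequence of functions $\{Q^w_F g : F \in \mathcal F\}$ is $\mathcal F$-adapted, a
key component of the functional energy inequality. In particular, from [11, Cor 7.5], there follows

$\Bigl| \sum_{F \in \mathcal F} \sum_{I : I \supsetneq F} \mathbb E^\sigma_{I_F} \Delta^\sigma_I f \cdot \langle H_\sigma I_F, Q^w_F g \rangle_w \Bigr| \lesssim \mathscr H \|f\|_\sigma \Bigl[ \sum_{F \in \mathcal F} \|Q^w_F g\|_w^2 \Bigr]^{1/2} = \mathscr H \|f\|_\sigma \|g\|_w.$

By inspection, the sum on the left equals the expression in (3.12), so the proof is complete, aside
from the functional energy inequality. □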

It remains to control $B^{\mathrm{above}}_{\mathcal F}(f, g)$. Keeping the quasi-orthogonality argument in mind, we see
that appropriate control on the individual summands is enough to control it. To describe what has
been done, one must note that the functions $P^\sigma_F f$ need not be bounded. But, they have bounded
averages, and both functions $P^\sigma_F f$ and $Q^w_F g$ are well-adapted to the pair of weights $w$, $\sigma$. This is
formalized in the next definition.

3.13. Definition. Let $I_0$ be an interval, and let $\mathcal S$ be a collection of disjoint intervals contained
in $I_0$. A function $f \in L^2_0(I_0, \sigma)$ is said to be uniform (w.r.t. $\mathcal S$) if these conditions are met:

(1) Each energy stopping interval $F \in \mathcal F_{energy}(I_0)$ is contained in some $S \in \mathcal S$.

(2) The function $f$ is constant on each interval $S \in \mathcal S$.

(3) For any interval $I$ which is not contained in any $S \in \mathcal S$, $\mathbb E^\sigma_I |f| \le 1$.

We will say that $g$ is adapted to a function $f$ uniform w.r.t. $\mathcal S$, if $g$ is constant on each interval
$S \in \mathcal S$. We will also say that $g$ is adapted to $\mathcal S$.

Let us define what we mean by the local estimate. The constant $\mathscr B_{local}$ is defined as the best
constant in

(3.14)  $|B^{\mathrm{above}}(f, g)| \le \mathscr B_{local} \{\sigma(I_0)^{1/2} + \|f\|_\sigma\} \|g\|_w,$

where $f$, $g$ are of mean zero on their respective spaces, supported on an interval $I_0$. Moreover, $f$
is uniform, and $g$ is adapted to $f$. The inequality above is homogeneous in $g$, but not $f$, since the
term $\sigma(I_0)^{1/2}$ is motivated by the bounded averages property of $f$.



^6 A gap in the proof of the Poisson inequality at [29, Page 542] can be fixed as in [30] or [9, Lemma 4.10].

The reduction from global to local estimate is Theorem 3.11. The Lemma below shows that
it suffices to bound the local estimate.

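3.15. Lemma. There holds

$|B^{\mathrm{above}}(f, g)| \lesssim \{\mathscr B_{local} + \mathscr H\} \|f\|_\sigma \|g\|_w.$

Proof. Let $\mathcal F$ and $\alpha_f(\cdot)$ be standard Calderón-Zygmund stopping data for $f$. By Theorem 3.11,
it suffices to bound $B^{\mathrm{above}}_{\mathcal F}(f, g) = \sum_{F \in \mathcal F} B^{\mathrm{above}}(P^\sigma_F f, Q^w_F g)$.

For each $F \in \mathcal F$, let $\mathcal S_F$ be the $\mathcal F$-children of $F$. Observe that the function

(3.16)  $(C \alpha_f(F))^{-1} P^\sigma_F f$

is uniform on $F$ w.r.t. $\mathcal S_F$, for appropriate absolute constant $C$. Moreover, the function $Q^w_F g$ does
not have any interval $J$ in its Haar support contained in an interval $S \in \mathcal S_F$. That is, it is adapted
to the function in (3.16). Therefore, by assumption,

$|B^{\mathrm{above}}(P^\sigma_F f, Q^w_F g)| \lesssim \mathscr B_{local} \{\alpha_f(F) \sigma(F)^{1/2} + \|P^\sigma_F f\|_\sigma\} \|Q^w_F g\|_w.$

The sum over $F \in \mathcal F$ of the right hand side is bounded by the quasi-orthogonality argument of
(3.10). □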

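Thus, it remains to show that $\mathscr B_{local} \lesssim \mathscr H$. The following reduction in the local estimate is a
routine appeal to the testing condition. Focusing on the argument of the Hilbert transform in
(3.14), we write $I_J = I_0 - (I_0 - I_J)$. When the interval is $I_0$, and $J$ is in the Haar support of $g$,
notice that the scalar

$\varepsilon_J := \sum_{I : J \Subset I \subset I_0} \mathbb E^\sigma_{I_J} \Delta^\sigma_I f$

is bounded by one. Say that $f$ is uniform w.r.t. $\mathcal S$, and let $I^-$ be the minimal interval in the Haar
support of $f$ with $J \Subset I^-$. Since $g$ is adapted to $f$, we cannot have $I^-_J$ contained in an interval of
$\mathcal S$, and so $|\mathbb E^\sigma_{I^-_J} f| \le 1$. By the telescoping identity for martingale differences,

$\varepsilon_J = \sum_{I : I^- \subset I \subset I_0} \mathbb E^\sigma_{I_J} \Delta^\sigma_I f = \mathbb E^\sigma_{I^-_J} f,$

which is at most one in absolute value.
Therefore, we can write

$\Bigl| \sum_{I : I \subset I_0} \sum_{J : J \Subset I} \mathbb E^\sigma_{I_J} \Delta^\sigma_I f \cdot \langle H_\sigma I_0, \Delta^w_J g \rangle_w \Bigr| = \Bigl| \Bigl\langle H_\sigma I_0, \sum_{J : J \Subset I_0} \varepsilon_J \Delta^w_J g \Bigr\rangle_w \Bigr|$
$\qquad \le \mathscr T \sigma(I_0)^{1/2} \Bigl\| \sum_{J : J \Subset I_0} \varepsilon_J \Delta^w_J g \Bigr\|_w \le \mathscr T \sigma(I_0)^{1/2} \|g\|_w.$

This uses only interval testing and orthogonality of the martingale differences, and it matches the
first half of the right hand side of (3.14).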

When the argument of the Hilbert transform is $I_0 - I_J$, this is the stopping form, the last
component of the local part of the problem. The treatment of it, in the next section, is the main 
novelty of this paper. 

4. The Stopping Form

Given an interval $I_0$, the stopping form is

(4.1)  $B^{\mathrm{stop}}_{I_0}(f, g) := \sum_{I : I \subset I_0} \sum_{J : J \Subset I} \mathbb E^\sigma_{I_J} \Delta^\sigma_I f \cdot \langle H_\sigma(I_0 - I_J), \Delta^w_J g \rangle_w.$

We prove the estimate below for the stopping form, which completes the proof of the inequality
$\mathscr B_{local} \lesssim \mathscr H$, and so in view of Lemma 3.15, completes the proof of the main theorem of this paper.
Note that the hypotheses on $f$ and $g$ are that they are adapted to energy stopping intervals.
(Bounded averages on $f$ are no longer required.)

4.2. Lemma. Fix an interval $I_0$, and let $f$ and $g$ be adapted to $\mathcal F_{energy}(I_0)$. Then,

$|B^{\mathrm{stop}}_{I_0}(f, g)| \lesssim \mathscr H \|f\|_\sigma \|g\|_w.$

The stopping form arises naturally in any proof of a T1 theorem using Haar or other bases. In
the non-homogeneous case, or in the Tb setting, where (adapted) Haar functions are important
tools, it frequently appears in more or less this form. Regardless of how it arises, the stopping
form is treated as an error, in that it is bounded by some simple geometric series, obtaining decay
as e.g. the ratio |J|/|I| is held fixed. (See for instance [18, (7.16)].)

These sorts of arguments, however, implicitly require some additional hypotheses, such as
the weights being mutually $A_\infty$. Of course, the two weights above can be mutually singular.
There is no a priori control of the stopping form in terms of simple parameters like |J|/|I|, even
supplemented by additional pigeonholing of various parameters.

Our method is inspired by proofs of Carleson's Theorem on Fourier series [2,6,12], and has 
one particular precedent in the current setting, a much simpler bound for the stopping form in 
[11, §6.1]. 

4.1. Admissible Pairs. A range of decompositions of the stopping form necessitate a somewhat 
heavy notation that we introduce here. The individual summands in the stopping form involve 
four distinct intervals, namely $I_0$, $I$, $I_J$, and $J$. The interval $I_0$ will not change in this argument, 
and the pair $(I, J)$ determines $I_J$. Subsequent decompositions are easiest to phrase as actions on 
collections $\mathcal Q$ of pairs of intervals $Q = (Q_1, Q_2)$ with $Q_1 \supset Q_2$. (The letter P is already taken for 
the Poisson integral.) And we consider the bilinear forms 

$$ B_{\mathcal Q}(f,g) := \sum_{Q \in \mathcal Q} E^{\sigma}_{Q_2} \Delta^{\sigma}_{Q_1} f \cdot \langle H_\sigma (I_0 - (Q_1)_{Q_2}), \Delta^{w}_{Q_2} g \rangle_w . $$

We make the standing assumption that all collections $\mathcal Q$ that we consider are admissible. 

4.3. Definition. A collection of pairs $\mathcal Q$ is admissible if it meets these criteria. For any 
$Q = (Q_1, Q_2) \in \mathcal Q$, 

(1) $Q_2 \Subset (Q_1)_{Q_2} \subset I_0$. 

(2) (convexity in $Q_1$) If $Q'' \in \mathcal Q$ with $Q''_2 = Q_2$ and $Q''_1 \subset I \subset Q_1$, then there is a $Q' \in \mathcal Q$ 
with $Q'_1 = I$ and $Q'_2 = Q_2$. 

The first property is self-explanatory. The second property is convexity in $Q_1$, holding $Q_2$ fixed, 
which is used in the estimates on the stopping form which conclude the argument. A third property 
is described below. 

We exclusively use the notation $\mathcal Q_k$, $k = 1,2$, for the collection of intervals $\bigcup\{Q_k : Q \in \mathcal Q\}$, 
not counting multiplicity. Similarly, set $\widetilde{\mathcal Q}_1 := \{(Q_1)_{Q_2} : Q \in \mathcal Q\}$, and $\widetilde Q_1 := (Q_1)_{Q_2}$. 

(3) No interval $K \in \widetilde{\mathcal Q}_1 \cup \mathcal Q_2$ is contained in an interval $S \in \mathcal F_{\textup{energy}}(I_0)$. 

The last requirement comes from the assumption that the functions $f$ and $g$ be adapted to 
$\mathcal F_{\textup{energy}}(I_0)$. We will be appealing to different Hilbertian arguments below, so we prefer to make 
this an assumption about the pairs rather than the functions $f$, $g$. 

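The stopping form is obtained with the admissible collection of pairs given by 

$$ \mathcal Q_0 = \{ (I,J) : J \Subset I_J ,\ J \not\subset \bigcup \{ S : S \in \mathcal S \} \} . \tag{4.4} $$

In this definition $\mathcal S$ is the collection of subintervals of $I_0$ which $f$ is uniform with respect to. There 
holds $B^{\textup{stop}}_{I_0}(f, g) = B_{\mathcal Q_0}(f, g)$ for $f$, $g$ adapted to $\mathcal F_{\textup{energy}}(I_0)$. 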

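There is a very important notion of the size of $\mathcal Q$. 

$$ \operatorname{size}(\mathcal Q)^2 := \sup_{K \in \widetilde{\mathcal Q}_1 \cup \mathcal Q_2} \frac{P(\sigma(I_0 - K), K)^2}{\sigma(K)\,|K|^2} \sum_{J \in \mathcal Q_2 \,:\, J \subset K} \langle x, h^w_J \rangle_w^2 . $$

For admissible $\mathcal Q$, there holds $\operatorname{size}(\mathcal Q) \lesssim \mathcal H$, as follows from property (3) in Definition 4.3, and 
Definition 3.6. 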

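More definitions follow. Set the norm of the bilinear form $B_{\mathcal Q}$ to be the best constant $\mathbf B_{\mathcal Q}$ in the 
inequality 

$$ |B_{\mathcal Q}(f,g)| \le \mathbf B_{\mathcal Q} \, \|f\|_\sigma \|g\|_w . $$

Thus, our goal is to show that $\mathbf B_{\mathcal Q} \lesssim \operatorname{size}(\mathcal Q)$ for admissible $\mathcal Q$, but we will only be able to do this 
directly in the case that the pairs $(Q_1, Q_2)$ are weakly decoupled. 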

Say that collections of pairs $\mathcal Q^j$, for $j \in \mathbb N$, are mutually orthogonal if on the one hand, the 
collections $(\mathcal Q^j)_2$ are pairwise disjoint, and on the other, the collections $(\widetilde{\mathcal Q^j})_1$ are pairwise 
disjoint. (The concept has to be different in the first and second coordinates of the pairs, due to 
the different role of the intervals $Q_1$ and $Q_2$.) 

The meaning of mutual orthogonality is best expressed through the norm of the associated 
bilinear forms. Under the assumption that $B_{\mathcal Q} = \sum_{j \in \mathbb N} B_{\mathcal Q^j}$, and that the $\{\mathcal Q^j : j \in \mathbb N\}$ are 
mutually orthogonal, the following essential inequality holds. 

$$ \mathbf B_{\mathcal Q} \le \sqrt 2 \, \sup_{j \in \mathbb N} \mathbf B_{\mathcal Q^j} . \tag{4.5} $$
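Indeed, for $j \in \mathbb N$, let $\Pi^w_j$ be the projection onto the linear span of the Haar functions $\{h^w_J : J \in \mathcal Q^j_2\}$, 
and use a similar notation for $\Pi^\sigma_j$. We then have the two inequalities 

$$ \sum_{j \in \mathbb N} \|\Pi^w_j g\|_w^2 \le \|g\|_w^2 , \qquad \sum_{j \in \mathbb N} \|\Pi^\sigma_j f\|_\sigma^2 \le 2 \|f\|_\sigma^2 . $$

Note the factor of two on the second inequality. Therefore, we have 

$$
\begin{aligned}
|B_{\mathcal Q}(f,g)| &\le \sum_{j \in \mathbb N} |B_{\mathcal Q^j}(f,g)| \\
&= \sum_{j \in \mathbb N} |B_{\mathcal Q^j}(\Pi^\sigma_j f, \Pi^w_j g)| \\
&\le \sum_{j \in \mathbb N} \mathbf B_{\mathcal Q^j} \|\Pi^\sigma_j f\|_\sigma \|\Pi^w_j g\|_w \le \sqrt 2 \, \sup_{j \in \mathbb N} \mathbf B_{\mathcal Q^j} \cdot \|f\|_\sigma \|g\|_w .
\end{aligned}
$$

This proves (4.5). 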
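
The Cauchy-Schwarz step behind (4.5) can be checked in a finite-dimensional toy model. The sketch below is only an illustration under stated assumptions: coordinate blocks of a vector stand in for the Haar projections, the blocks for `g` are disjoint, each coordinate of `f` lies in at most two blocks, and the numbers `norms` are hypothetical stand-ins for the constants of the pieces. None of these names come from the paper.

```python
import math
import random


def block_norm(vec, block):
    """Euclidean norm of the coordinates of `vec` indexed by `block`."""
    return math.sqrt(sum(vec[i] ** 2 for i in block))


def orthogonality_check(n_blocks=8, width=4, seed=0):
    """Return (lhs, rhs) for a toy analogue of inequality (4.5)."""
    rng = random.Random(seed)
    dim = n_blocks * width
    f = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Disjoint blocks for g: analogue of pairwise disjoint second coordinates.
    g_blocks = [range(j * width, (j + 1) * width) for j in range(n_blocks)]
    # Each f block overlaps the next one, so a coordinate lies in at most
    # two blocks: this is the source of the factor of two.
    f_blocks = [range(j * width, min(dim, (j + 2) * width)) for j in range(n_blocks)]
    # Hypothetical norms of the individual pieces.
    norms = [rng.uniform(0.0, 1.0) for _ in range(n_blocks)]
    lhs = sum(
        b * block_norm(f, fb) * block_norm(g, gb)
        for b, fb, gb in zip(norms, f_blocks, g_blocks)
    )
    whole = range(dim)
    rhs = math.sqrt(2.0) * max(norms) * block_norm(f, whole) * block_norm(g, whole)
    return lhs, rhs
```

The left side is the sum of the piecewise bounds, and the right side is $\sqrt 2$ times the largest piece norm times the two full norms, mirroring the last line of the display above.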

4.2. The Recursive Argument. This is the essence of the matter. 

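4.6. Lemma. [Size Lemma] An admissible collection of pairs $\mathcal Q$ can be partitioned into collections 
$\mathcal Q^{\textup{large}}$ and admissible $\mathcal Q^{\textup{small}}_t$, for $t \in \mathbb N$, such that 

$$ \mathbf B_{\mathcal Q} \le C \operatorname{size}(\mathcal Q) + (1 + \sqrt 2) \sup_{t} \mathbf B_{\mathcal Q^{\textup{small}}_t} , \tag{4.7} $$

and 

$$ \sup_{t \in \mathbb N} \operatorname{size}(\mathcal Q^{\textup{small}}_t) \le \tfrac14 \operatorname{size}(\mathcal Q) . $$

Here, $C > 0$ is an absolute constant. 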

The point of the lemma is that all of the constituent parts are better in some way, and that 
the right hand side of (4.7) involves a favorable supremum. We can quickly prove the main result 
of this section. 

Proof of Lemma 4.2. The stopping form of this Lemma is of the form $B_{\mathcal Q}(f, g)$ for an admissible 
choice of $\mathcal Q$, with $\operatorname{size}(\mathcal Q) \le C \mathcal H$, as we have noted in (4.4). Define 

$$ C(\lambda) := \sup\{ \mathbf B_{\mathcal Q} : \operatorname{size}(\mathcal Q) \le C \lambda \mathcal H \} , \qquad 0 < \lambda \le 1 , $$

where $C > 0$ is a sufficiently large, but absolute constant, and the supremum is over admissible 
choices of $\mathcal Q$. We are free to assume that $Q_1$ and $Q_2$ are further constrained to be in some fixed, 
but large, collection of intervals $\mathcal X$. Then, it is clear that $C(\lambda)$ is finite, for all $0 < \lambda \le 1$. Because 
of the way the constant $\mathcal H$ enters into the definition, it remains to show that $C(1)$ admits an 
absolute upper bound, independent of how $\mathcal X$ is chosen. 

It is a consequence of Lemma 4.6 that there holds 

$$ C(\lambda) \le C \lambda + (1 + \sqrt 2)\, C(\lambda/4) , \qquad 0 < \lambda \le 1 . $$

Iterating this inequality beginning at $\lambda = 1$ gives us 

$$ C(1) \le C + (1 + \sqrt 2)\, C(1/4) \le \cdots \le C \sum_{t=0}^{\infty} \Bigl[ \frac{1 + \sqrt 2}{4} \Bigr]^t \le 4C . $$

So we have established an absolute upper bound on $C(1)$. □ 
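
The arithmetic of the iteration can be checked numerically. The following is a minimal sketch, assuming that the remainder term $(1+\sqrt 2)^n C(4^{-n})$ left after $n$ steps may be discarded; the function name and the parameters `c` and `depth` are illustrative and not part of the paper.

```python
import math


def iterate_bound(c=1.0, depth=60):
    """Unroll C(lam) <= c*lam + (1 + sqrt 2) * C(lam / 4) starting at lam = 1.

    Returns the partial sum after `depth` steps (remainder term dropped)
    and the closed form of the full geometric series.
    """
    growth = 1.0 + math.sqrt(2.0)
    ratio = growth / 4.0          # common ratio of the geometric series
    total, lam, factor = 0.0, 1.0, 1.0
    for _ in range(depth):
        total += factor * c * lam  # contribution c * ((1 + sqrt 2) / 4) ** t
        factor *= growth
        lam /= 4.0
    closed_form = c / (1.0 - ratio)
    return total, closed_form
```

Since the ratio $(1+\sqrt 2)/4$ is about $0.60$, the series sums to roughly $2.52\,C$, comfortably below the bound $4C$ used above. The gain of a factor $1/4$ in size at each step is what beats the loss of $1 + \sqrt 2$ in the supremum.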



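4.3. Proof of Lemma 4.6. We restate the conclusion of Lemma 4.6 to more closely follow the 
line of argument to follow. The collection $\mathcal Q$ can be partitioned into two collections $\mathcal Q^{\textup{large}}$ and 
$\mathcal Q^{\textup{small}}$ such that 

(1) $\mathbf B_{\mathcal Q^{\textup{large}}} \lesssim \tau$, where $\tau = \operatorname{size}(\mathcal Q)$. 

(2) $\mathcal Q^{\textup{small}} = \mathcal Q^{\textup{small}}_1 \cup \mathcal Q^{\textup{small}}_2$. 

(3) The collection $\mathcal Q^{\textup{small}}_2$ is admissible, and $\operatorname{size}(\mathcal Q^{\textup{small}}_2) \le \tau/4$. 

(4) For a collection of dyadic intervals $\mathcal L$, the collection $\mathcal Q^{\textup{small}}_1$ is the union of mutually 
orthogonal admissible collections $\mathcal Q^{\textup{small}}_{1,L}$, for $L \in \mathcal L$, with 

$$ \operatorname{size}(\mathcal Q^{\textup{small}}_{1,L}) \le \frac{\tau}{4} , \qquad L \in \mathcal L . $$

Thus, we have by inequality (4.5) for mutually orthogonal collections, 

$$
\begin{aligned}
\mathbf B_{\mathcal Q} &\le \mathbf B_{\mathcal Q^{\textup{large}}} + \mathbf B_{\mathcal Q^{\textup{small}}_1 \cup \mathcal Q^{\textup{small}}_2} \\
&\le \mathbf B_{\mathcal Q^{\textup{large}}} + \mathbf B_{\mathcal Q^{\textup{small}}_1} + \mathbf B_{\mathcal Q^{\textup{small}}_2} \\
&\le C \tau + (1 + \sqrt 2) \max\Bigl\{ \mathbf B_{\mathcal Q^{\textup{small}}_2} ,\ \sup_{L \in \mathcal L} \mathbf B_{\mathcal Q^{\textup{small}}_{1,L}} \Bigr\} .
\end{aligned}
$$

This, with the properties of size listed above, proves Lemma 4.6 as stated, after a trivial re-indexing. 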

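All else flows from this construction of a subset $\mathcal L$ of dyadic subintervals of $I_0$. The initial 
intervals in $\mathcal L$ are the minimal intervals $K \in \widetilde{\mathcal Q}_1 \cup \mathcal Q_2$ such that 

$$ \frac{P(\sigma(I_0 - K), K)^2}{|K|^2} \sum_{J \in \mathcal Q_2 \,:\, J \subset K} \langle x, h^w_J \rangle_w^2 \ge \frac{\tau^2}{16}\, \sigma(K) . \tag{4.8} $$

Since $\operatorname{size}(\mathcal Q) = \tau$, there are such intervals $K$. 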

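Initialize $\mathcal S$ (for 'stock' or 'supply') to be all the dyadic intervals in $\widetilde{\mathcal Q}_1 \cup \mathcal Q_2$ which are not 
contained in any element of $\mathcal L$. In the recursive step, let $\mathcal L'$ be the minimal elements $S \in \mathcal S$ such 
that 

$$ \sum_{J \in \mathcal Q_2 \,:\, J \subset S} \langle x, h^w_J \rangle_w^2 \ \ge\ \rho \sum_{\substack{L \in \mathcal L \,:\, L \subset S \\ L \text{ is maximal}}} \ \sum_{J \in \mathcal Q_2 \,:\, J \subset L} \langle x, h^w_J \rangle_w^2 , \qquad \rho = \tfrac{17}{16} . \tag{4.9} $$

(The inequality would be trivial if $\rho = 1$.) If $\mathcal L'$ is empty the recursion stops. Otherwise, update 
$\mathcal L \leftarrow \mathcal L \cup \mathcal L'$ and $\mathcal S \leftarrow \{K \in \mathcal S : K \not\subset L \ \forall L \in \mathcal L\}$. 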

Once the recursion stops, report the collection $\mathcal L$. It has this crucial property: For $L \in \mathcal L$, and 
integers $t \ge 1$, 

$$ \sum_{L' \,:\, \pi^t_{\mathcal L} L' = L} \ \sum_{J \in \mathcal Q_2 \,:\, J \subset L'} \langle x, h^w_J \rangle_w^2 \ \le\ \rho^{-t} \sum_{J \in \mathcal Q_2 \,:\, J \subset L} \langle x, h^w_J \rangle_w^2 . \tag{4.10} $$

Figure 1. The shaded smaller tents have been selected, and $T_L$ is the minimal 
tent with $\mu(T_L)$ larger than $\rho$ times the $\mu$-measure of the shaded tents. 

Indeed, in the case of $t = 1$, this is the selection criterion for membership in $\mathcal L$, and a simple induction 
proves the statement for all $t > 1$. 

4.11. Remark. The selection of $\mathcal L$ can be understood as a familiar argument concerning Carleson 
measures, although there is no such object in this argument. Consider the measure $\mu$ on the upper half-plane given 
as a sum of point masses given by 

$$ \mu := \sum_{J \in \mathcal Q_2 \,:\, J \subset I_0} \langle x, h^w_J \rangle_w^2 \, \delta_{(x_J, |J|)} , \qquad x_J \text{ is the center of } J. $$

The tent over $L$ is the triangular region $T_L := \{(x, y) : |x - x_L| \le |L| - y\}$, so that 

$$ \mu(T_L) = \sum_{J \in \mathcal Q_2 \,:\, J \subset L} \langle x, h^w_J \rangle_w^2 . $$

Then, the selection rule for membership in $\mathcal L$ can be understood as taking the minimal tent $T_L$ 
such that $\mu(T_L)$ is bigger than $\rho$ times the $\mu$-measure of the selected tents. See Figure 1. 
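
The recursive selection can be sketched on a toy dyadic tree. The code below is an illustration under stated assumptions, not the construction of the paper: dyadic intervals are encoded as pairs `(k, m)` for $[m 2^{-k}, (m+1) 2^{-k})$, `weights` plays the role of the point masses of $\mu$, the initial threshold `theta * |K|` is a hypothetical stand-in for (4.8), and candidates in the recursive step are required to contain an already selected interval. All function names are invented for this example.

```python
def contains(big, small):
    """True if dyadic interval `small` lies inside `big` (equality allowed)."""
    (k1, m1), (k2, m2) = big, small
    return k2 >= k1 and (m2 >> (k2 - k1)) == m1


def maximal_inside(S, chosen):
    """Maximal elements of `chosen` strictly contained in S."""
    inside = [L for L in chosen if L != S and contains(S, L)]
    return [L for L in inside if not any(M != L and contains(M, L) for M in inside)]


def minimal(cands):
    """Elements of `cands` with no other candidate inside them."""
    return {S for S in cands if not any(T != S and contains(S, T) for T in cands)}


def select_tents(weights, theta, rho):
    """Toy selection: returns the selected intervals and the tent masses."""
    intervals = sorted(weights)
    # Tent mass: total weight of all intervals J contained in S.
    tent = {S: sum(w for J, w in weights.items() if contains(S, J)) for S in intervals}
    # Initial step (hypothetical stand-in for (4.8)).
    selected = minimal({K for K in intervals if tent[K] >= theta * 2.0 ** (-K[0])})
    while True:
        # Stock: intervals not contained in any selected interval.
        stock = [S for S in intervals if not any(contains(L, S) for L in selected)]
        cands = set()
        for S in stock:
            kids = maximal_inside(S, selected)
            # Analogue of (4.9): tent mass exceeds rho times the selected mass.
            if kids and tent[S] >= rho * sum(tent[L] for L in kids):
                cands.add(S)
        new = minimal(cands)
        if not new:
            return selected, tent
        selected |= new


def decay_holds(selected, tent, rho, tol=1e-12):
    """Check the t = 1 case of (4.10) for every selected interval."""
    for L in selected:
        kids = maximal_inside(L, selected)
        if kids and rho * sum(tent[M] for M in kids) > tent[L] + tol:
            return False
    return True
```

Once an interval is selected, everything inside it leaves the stock, so its selected children are fixed at the moment of selection. This is why the $t = 1$ case of (4.10) is exactly the selection rule, and `decay_holds` verifies it after the recursion stops.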

The decomposition of $\mathcal Q$ is based upon the relation of the pairs to the collection $\mathcal L$, namely a 
pair $\widetilde Q_1$, $Q_2$ can (a) both have the same parent in $\mathcal L$; (b) have distinct parents in $\mathcal L$; (c) $Q_2$ can 
have a parent in $\mathcal L$, but not $\widetilde Q_1$; and (d) $Q_2$ does not have a parent in $\mathcal L$. 

A particularly vexing aspect of the stopping form is the linkage between the martingale difference 
on $g$, which is given by $J$, and the argument of the Hilbert transform, $I_0 - I_J$. The 'large' collections 
constructed below will, in a certain way, decouple the $J$ and the $I_0 - I_J$, enough so that the norm of 
the associated bilinear form can be estimated by the size of $\mathcal Q$. 

In the 'small' collections, there is however no decoupling, but critically, the size of the 
collections is smaller, and the estimate is given in terms of the supremum in (4.7). 

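Pairs comparable to $\mathcal L$. Define 

$$ \mathcal Q_{L,t} := \{ Q \in \mathcal Q : \pi_{\mathcal L} \widetilde Q_1 = \pi^t_{\mathcal L} Q_2 = L \} , \qquad L \in \mathcal L ,\ t \in \mathbb N . $$

These are admissible collections, as the convexity property in $Q_1$, holding $Q_2$ constant, is clearly 
inherited from $\mathcal Q$. Now, observe that for each $t \in \mathbb N$, the collections $\{\mathcal Q_{L,t} : L \in \mathcal L\}$ are mutually 
orthogonal. The collections of intervals $(\mathcal Q_{L,t})_2$ are obviously disjoint in $L \in \mathcal L$, with $t \in \mathbb N$ held 
fixed. And, since membership in these collections is determined in the first coordinate by the 
interval $\widetilde Q_1$, and the two children of $Q_1$ can have two different parents in $\mathcal L$, a given interval $I$ 
can appear in at most two collections $(\mathcal Q_{L,t})_1$, as $L \in \mathcal L$ varies, and $t \in \mathbb N$ is held fixed. 
Define $\mathcal Q^{\textup{small}}_1$ to be the union over $L \in \mathcal L$ of the collections 

$$ \mathcal Q^{\textup{small}}_{1,L} := \{ Q \in \mathcal Q_{L,1} : \widetilde Q_1 \ne L \} . $$

Note in particular that we have only allowed $t = 1$ above, and $\widetilde Q_1 = L$ is not allowed. For these 
collections, we need only verify that 

$$ \operatorname{size}(\mathcal Q^{\textup{small}}_{1,L}) \le \sqrt{\rho - 1} \cdot \tau = \frac{\tau}{4} , \qquad L \in \mathcal L . \tag{4.12} $$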

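Proof. An interval $K \in (\widetilde{\mathcal Q^{\textup{small}}_{1,L}})_1 \cup (\mathcal Q^{\textup{small}}_{1,L})_2$ is not in $\mathcal L$, by construction. Suppose that $K$ does not 
contain any interval in $\mathcal L$. By the selection of the initial intervals in $\mathcal L$, the minimal intervals in 
$\widetilde{\mathcal Q}_1 \cup \mathcal Q_2$ which satisfy (4.8), it follows that the interval $K$ must fail (4.8). And so we are done. 

Thus, $K$ contains some element of $\mathcal L$, whence the inequality (4.9) must fail. Namely, rearranging 
that inequality, 

$$ \sum_{\substack{J \in \mathcal Q_2 \,:\, \pi_{\mathcal L} J = L \\ J \subset K}} \langle x, h^w_J \rangle_w^2 \ \le\ (\rho - 1) \sum_{\substack{L' \in \mathcal L \,:\, L' \subset K \\ L' \text{ is maximal}}} \ \sum_{J \in \mathcal Q_2 \,:\, J \subset L'} \langle x, h^w_J \rangle_w^2 . $$

Recall that $\rho - 1 = \frac{1}{16}$. We can estimate 

$$
\begin{aligned}
\sum_{\substack{J \in \mathcal Q_2 \,:\, \pi_{\mathcal L} J = L \\ J \subset K}} \langle x, h^w_J \rangle_w^2
&\le \frac{1}{16} \sum_{\substack{L' \in \mathcal L \,:\, L' \subset K \\ L' \text{ is maximal}}} \ \sum_{J \in \mathcal Q_2 \,:\, J \subset L'} \langle x, h^w_J \rangle_w^2 \\
&\le \frac{1}{16} \sum_{J \in \mathcal Q_2 \,:\, J \subset K} \langle x, h^w_J \rangle_w^2 \\
&\le \frac{\tau^2}{16} \cdot \frac{|K|^2 \, \sigma(K)}{P(\sigma(I_0 - K), K)^2} .
\end{aligned}
$$

The last inequality follows from the definition of size, and finishes the proof of (4.12). □ 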

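The collections below are the first contribution to $\mathcal Q^{\textup{large}}$. Take $\mathcal Q^{\textup{large}}_1 := \bigcup\{\mathcal Q^{\textup{large}}_{1,L} : L \in \mathcal L\}$, 
where 

$$ \mathcal Q^{\textup{large}}_{1,L} := \{ Q \in \mathcal Q_{L,1} : \widetilde Q_1 = L \} . $$

Note that Lemma 4.17 applies to this collection: take the collection $\mathcal S$ of that Lemma to be $\{L\}$, and 
the quantity $\eta$ in (4.18) satisfies $\eta \le \tau = \operatorname{size}(\mathcal Q)$, by inspection. From the mutual orthogonality 
(4.5), we then have 

$$ \mathbf B_{\mathcal Q^{\textup{large}}_1} \le \sqrt 2 \, \sup_{L \in \mathcal L} \mathbf B_{\mathcal Q^{\textup{large}}_{1,L}} \lesssim \tau . $$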

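The collections $\mathcal Q_{L,t}$, for $L \in \mathcal L$ and $t \ge 2$, are the second contribution to $\mathcal Q^{\textup{large}}$, namely 

$$ \mathcal Q^{\textup{large}}_2 := \bigcup_{L \in \mathcal L} \bigcup_{t \ge 2} \mathcal Q_{L,t} . $$

For them, we need to estimate $\mathbf B_{\mathcal Q_{L,t}}$. 

$$ \mathbf B_{\mathcal Q_{L,t}} \lesssim \rho^{-t/2} \tau . \tag{4.13} $$

From this, we can conclude from (4.5) that 

$$ \mathbf B_{\mathcal Q^{\textup{large}}_2} \le \sum_{t \ge 2} \mathbf B_{\bigcup\{\mathcal Q_{L,t} \,:\, L \in \mathcal L\}} \le \sqrt 2 \sum_{t \ge 2} \sup_{L \in \mathcal L} \mathbf B_{\mathcal Q_{L,t}} \lesssim \tau \sum_{t \ge 2} \rho^{-t/2} \lesssim \tau . $$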

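Proof of (4.13). For $L \in \mathcal L$, let $\mathcal S_L$ be the $\mathcal L$-children of $L$. For each $Q \in \mathcal Q_{L,t}$, we must have 
$Q_2 \subset \pi_{\mathcal S_L} Q_2 \subset \widetilde Q_1$. Then, divide the collection $\mathcal Q_{L,t}$ into three collections $\mathcal Q^{\ell}_{L,t}$, $\ell = 1,2,3$, where 

$$
\begin{aligned}
\mathcal Q^{1}_{L,t} &:= \{ Q \in \mathcal Q_{L,t} : Q_2 \Subset \pi_{\mathcal S_L} Q_2 \} , \\
\mathcal Q^{2}_{L,t} &:= \{ Q \in \mathcal Q_{L,t} : Q_2 \not\Subset \pi_{\mathcal S_L} Q_2 \Subset \widetilde Q_1 \} ,
\end{aligned}
$$

and $\mathcal Q^{3}_{L,t} := \mathcal Q_{L,t} - (\mathcal Q^{1}_{L,t} \cup \mathcal Q^{2}_{L,t})$ is the complementary collection. Notice that $\mathcal Q^{1}_{L,t}$ equals the 
whole collection $\mathcal Q_{L,t}$ for $t > r + 1$. 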

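We treat them in turn. The collections $\mathcal Q^{1}_{L,t}$ fit the hypotheses of Lemma 4.17, just take the 
collection of intervals $\mathcal S$ of that Lemma to be $\mathcal S_L$. It follows that $\mathbf B_{\mathcal Q^{1}_{L,t}} \lesssim \beta(t)$, where the latter 
is the best constant in the inequality 

$$ \sum_{J \in (\mathcal Q_{L,t})_2 \,:\, J \Subset K} P(\sigma(I_0 - K), J)^2 \Bigl\langle \frac{x}{|J|}, h^w_J \Bigr\rangle_w^2 \le \beta(t)^2 \sigma(K) , \qquad K \in \mathcal S_L ,\ L \in \mathcal L ,\ t \ge 2 . \tag{4.14} $$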

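There is an observation about the Poisson integral terms that we need. For $K$ as above, and 
$J \subset L' \Subset K$, note that by goodness of $L'$, 

$$ \operatorname{dist}(J, I_0 - K) \ge \operatorname{dist}(L', I_0 - K) \ge |L'|^{\epsilon} |K|^{1-\epsilon} \ge 2^{(r+1)(1-\epsilon)} |L'| . $$

From the definition of the Poisson integral, one sees that 

$$ \frac{P(\sigma(I_0 - K), J)}{|J|} \simeq \frac{P(\sigma(I_0 - K), L')}{|L'|} . \tag{4.15} $$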

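We have the estimate without decay in $t$, $\beta(t) \lesssim \operatorname{size}(\mathcal Q)$. Indeed, for $K$ as in (4.14), let $\mathcal J^*$ 
be the maximal intervals $J^*$ with $J^* \in (\mathcal Q_{L,t})_2$ and $J^* \Subset K$. Now, $\mathcal J^*$ is contained in the collection of 
intervals over which we test the size of $\mathcal Q$, hence by (4.15), 

$$ \textup{LHS}(4.14) = \sum_{J^* \in \mathcal J^*} \ \sum_{J \in (\mathcal Q_{L,t})_2 \,:\, J \subset J^*} P(\sigma(I_0 - K), J)^2 \Bigl\langle \frac{x}{|J|}, h^w_J \Bigr\rangle_w^2 \lesssim \operatorname{size}(\mathcal Q)^2 \sum_{J^* \in \mathcal J^*} \sigma(J^*) \le \operatorname{size}(\mathcal Q)^2 \sigma(K) . $$

This proves the claim, and we use the estimate for $t \le r + 3$, say. (Recall that $r$ is a fixed integer.) 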

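In the case of $t > r + 3$, the essential property is (4.10). The left hand side of (4.14) is 
dominated by the sum below. Note that we index the sum first over $L'$, which are $(r+1)$-fold 
$\mathcal L$-children of $K$, whence $L' \Subset K$, followed by $(t - r - 2)$-fold $\mathcal L$-children $L''$ of $L'$. 

$$
\begin{aligned}
\sum_{L' \in \mathcal L} \ \sum_{L'' \in \mathcal L} \ \sum_{J \in \mathcal Q_2 \,:\, J \subset L''} & P(\sigma(I_0 - K), J)^2 \Bigl\langle \frac{x}{|J|}, h^w_J \Bigr\rangle_w^2 \\
&\lesssim \sum_{L' \in \mathcal L} \frac{P(\sigma(I_0 - K), L')^2}{|L'|^2} \sum_{L'' \in \mathcal L} \ \sum_{J \in \mathcal Q_2 \,:\, J \subset L''} \langle x, h^w_J \rangle_w^2 \\
&\lesssim \rho^{-t} \sum_{L' \in \mathcal L} \frac{P(\sigma(I_0 - K), L')^2}{|L'|^2} \sum_{J \in \mathcal Q_2 \,:\, J \subset L'} \langle x, h^w_J \rangle_w^2 \\
&\lesssim \rho^{-t} \tau^2 \sum_{L' \in \mathcal L} \sigma(L') \le \rho^{-t} \tau^2 \sigma(K) .
\end{aligned}
$$

We have also used (4.15), and then the central property (4.10) following from the construction 
of $\mathcal L$, finally appealing to the definition of size. Hence, $\beta(t) \lesssim \tau \rho^{-t/2}$. This completes the analysis 
of $\mathcal Q^{1}_{L,t}$. 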

We need only consider the collections $\mathcal Q^{2}_{L,t}$ for $1 \le t \le r + 1$, and they fall under the scope of 
Lemma 4.22. And, we see immediately that we have $\mathbf B_{\mathcal Q^{2}_{L,t}} \lesssim \tau$. Similarly, we need only consider 
the collections $\mathcal Q^{3}_{L,t}$ for $1 \le t \le r + 1$. It follows that we must have $2^{r} \le |Q_1|/|Q_2| \le 2^{2r+2}$. 
Namely, this ratio can take only one of a finite number of values, implying that Lemma 4.24 
applies easily to this case to complete the proof. □ 

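Pairs not strictly comparable to $\mathcal L$. It remains to consider the pairs $Q \in \mathcal Q$ such that $\widetilde Q_1$ does 
not have a parent in $\mathcal L$. The collection $\mathcal Q^{\textup{small}}_2$ is taken to be the (much smaller) collection 

$$ \mathcal Q^{\textup{small}}_2 := \{ Q \in \mathcal Q : Q_2 \text{ does not have a parent in } \mathcal L \} . $$

Observe that $\operatorname{size}(\mathcal Q^{\textup{small}}_2) \le \sqrt{\rho - 1}\, \tau \le \frac{\tau}{4}$. This is as required for this collection.^1 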

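Proof. Suppose $\eta < \operatorname{size}(\mathcal Q^{\textup{small}}_2)$. Then, there is an interval $K \in (\widetilde{\mathcal Q^{\textup{small}}_2})_1 \cup (\mathcal Q^{\textup{small}}_2)_2$ so that 

$$ \eta^2 \sigma(K) \le \frac{P(\sigma(I_0 - K), K)^2}{|K|^2} \sum_{\substack{J \in (\mathcal Q^{\textup{small}}_2)_2 \\ J \subset K}} \langle x, h^w_J \rangle_w^2 . $$

Suppose that $K$ does not contain any interval in $\mathcal L$. It follows from the initial intervals added to 
$\mathcal L$, see (4.8), that we must have $\eta \le \frac{\tau}{4}$. 

Thus, $K$ contains an interval in $\mathcal L$. This means that $K$ must fail the inequality (4.9). Therefore, 
we have 

$$ \eta^2 \sigma(K) \le (\rho - 1) \frac{P(\sigma(I_0 - K), K)^2}{|K|^2} \sum_{\substack{J \in \mathcal Q_2 \\ J \subset K}} \langle x, h^w_J \rangle_w^2 \le \frac{\tau^2}{16} \sigma(K) . $$

This relies upon the definition of size, and proves our claim. □ 

^1 The collections $\mathcal Q^{\textup{small}}_1$ and $\mathcal Q^{\textup{small}}_2$ are also mutually orthogonal, but this fact is not needed for our proof. 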

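For the pairs not yet in one of our collections, it must be that $Q_2$ has a parent in $\mathcal L$, but not 
$\widetilde Q_1$. Using $\mathcal L^*$, the maximal intervals in $\mathcal L$, divide them into the three collections 

$$
\begin{aligned}
\mathcal Q^{\textup{large}}_3 &:= \{ Q \in \mathcal Q : Q_2 \Subset \pi_{\mathcal L^*} Q_2 \subset \widetilde Q_1 \} , \\
\mathcal Q^{\textup{large}}_4 &:= \{ Q \in \mathcal Q : Q_2 \not\Subset \pi_{\mathcal L^*} Q_2 \Subset \widetilde Q_1 \} , \\
\mathcal Q^{\textup{large}}_5 &:= \{ Q \in \mathcal Q : Q_2 \not\Subset \pi_{\mathcal L^*} Q_2 \subset \widetilde Q_1 , \text{ and } \pi_{\mathcal L^*} Q_2 \not\Subset \widetilde Q_1 \} .
\end{aligned}
$$

Observe that Lemma 4.17 applies to give 

$$ \mathbf B_{\mathcal Q^{\textup{large}}_3} \lesssim \tau . \tag{4.16} $$

Take the collection $\mathcal S$ of Lemma 4.17 to be $\mathcal L^*$, and note that the bound in that Lemma is given 
by $\eta$, as defined in (4.18), which by construction is less than $\tau = \operatorname{size}(\mathcal Q)$. 

Observe that Lemma 4.22 applies to show that the estimate (4.16) holds for $\mathcal Q^{\textup{large}}_4$. Take $\mathcal S$ 
of that Lemma to be $\mathcal L^*$. The estimate from Lemma 4.22 is given in terms of $\eta$, as defined in 
(4.23). But, $\eta$ is at most $\tau$. 

In the last collection, $\mathcal Q^{\textup{large}}_5$, notice that the conditions placed upon the pair imply that 
$|Q_1| \le 2^{2r+2} |Q_2|$, for all $Q \in \mathcal Q^{\textup{large}}_5$. It therefore follows from a straightforward application of 
Lemma 4.24, that (4.16) holds for this collection as well. 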

4.4. Upper Bounds on the Stopping Form. We have three lemmas that prove upper bounds 
on the norm of the stopping form in situations in which there is a measure of decoupling between 
the martingale difference on g, and the argument of the Hilbert transform. 

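4.17. Lemma. Let $\mathcal S$ be a collection of pairwise disjoint intervals in $I_0$. Let $\mathcal Q$ be admissible such 
that for each $Q \in \mathcal Q$, there is an $S \in \mathcal S$ with $Q_2 \Subset S \subset \widetilde Q_1$. Then, there holds 

$$ |B_{\mathcal Q}(f,g)| \lesssim \eta \, \|f\|_\sigma \|g\|_w , \qquad \text{where} \quad \eta^2 := \sup_{S \in \mathcal S} \frac{1}{\sigma(S)} \sum_{J \in \mathcal Q_2 \,:\, J \Subset S} P(\sigma(I_0 - S), J)^2 \Bigl\langle \frac{x}{|J|}, h^w_J \Bigr\rangle_w^2 . \tag{4.18} $$

(Note that $\operatorname{size}(\mathcal Q)$ need not control $\eta$.) 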

Proof. An interesting part of the proof is that it depends very much on cancellative properties 
of the martingale differences of $f$. (Absolute values must be taken outside the sum defining the 
stopping form!) 

Assume that the Haar support of $f$ is contained in $\mathcal Q_1$. Take $\mathcal F$ and $\alpha_f(\cdot)$ to be stopping data 
defined in this way. First, add to $\mathcal F$ the interval $I_0$, and set $\alpha_f(I_0) := E^{\sigma}_{I_0} |f|$. Inductively, if $F \in \mathcal F$ 
is minimal, add to $\mathcal F$ the maximal children $F'$ such that $\alpha_f(F') := E^{\sigma}_{F'} |f| > 4 \alpha_f(F)$. Note that 
the inequality (3.9) holds for this choice of $\mathcal F$ and $\alpha_f$, so that the quasi-orthogonality argument 
(3.10) is available to us. 
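Write the bilinear form as 

$$ B_{\mathcal Q}(f,g) = \sum_{J \in \mathcal Q_2} \langle H_\sigma \varphi_J , \Delta^w_J g \rangle_w , \qquad \text{where} \quad \varphi_J := \sum_{Q \in \mathcal Q \,:\, Q_2 = J} E^{\sigma}_{Q_2} \Delta^{\sigma}_{Q_1} f \cdot (I_0 - \widetilde Q_1) . \tag{4.19} $$

The function $\varphi_J$ is well-behaved. For any $J \in \mathcal Q_2$, $|\varphi_J| \lesssim \alpha_f(\pi_{\mathcal F} J) \, \mathbf 1_{A_J}$. In this definition, $A_J := 
\bigcup\{I_0 - \widetilde Q_1 : Q \in \mathcal Q, Q_2 = J\}$. Indeed, at each point $x \in A_J$, the sum defining $\varphi_J(x)$ is over 
pairs $Q$ such that $Q_2 = J$ and $x \in I_0 - \widetilde Q_1$. By the convexity property of admissible collections, 
the sum is over consecutive martingale differences of $f$. The basic telescoping property of these 
differences shows that the sum is bounded by the stopping value $\alpha_f(\pi_{\mathcal F} J)$. Let $I^*$ be the maximal 
interval of the form $Q_1$ with $x \in I_0 - \widetilde Q_1$, and let $I_*$ be the child of the minimal such interval 
which contains $J$. Then, 

$$ |\varphi_J(x)| = \Bigl| \sum_{\substack{Q \in \mathcal Q \,:\, Q_2 = J \\ x \in I_0 - \widetilde Q_1}} E^{\sigma}_{Q_2} \Delta^{\sigma}_{Q_1} f \Bigr| = \bigl| E^{\sigma}_{I_*} f - E^{\sigma}_{I^*} f \bigr| \lesssim \alpha_f(\pi_{\mathcal F} J) \, \mathbf 1_{I_0 - S}(x) , \tag{4.20} $$

where $S$ is the $\mathcal S$-parent of $J$. 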

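We can estimate as below, for $F \in \mathcal F$: 

$$
\begin{aligned}
\Xi(F) &:= \Bigl| \sum_{Q \in \mathcal Q \,:\, \pi_{\mathcal F} Q_2 = F} E^{\sigma}_{Q_2} \Delta^{\sigma}_{Q_1} f \cdot \langle H_\sigma (I_0 - \widetilde Q_1), \Delta^w_{Q_2} g \rangle_w \Bigr| \\
&\overset{(4.19)}{=} \Bigl| \sum_{J \in \mathcal Q_2 \,:\, \pi_{\mathcal F} J = F} \langle H_\sigma \varphi_J , \Delta^w_J g \rangle_w \Bigr| \\
&\overset{(4.20)}{\lesssim} \alpha_f(F) \sum_{\substack{S \in \mathcal S \\ \pi_{\mathcal F} S = F}} \ \sum_{\substack{J \in \mathcal Q_2 \\ J \subset S}} P(\sigma(I_0 - S), J) \Bigl| \Bigl\langle \frac{x}{|J|}, \Delta^w_J g \Bigr\rangle_w \Bigr| \\
&\le \alpha_f(F) \Bigl[ \sum_{\substack{S \in \mathcal S \\ \pi_{\mathcal F} S = F}} \ \sum_{\substack{J \in \mathcal Q_2 \\ J \subset S}} P(\sigma(I_0 - S), J)^2 \Bigl\langle \frac{x}{|J|}, h^w_J \Bigr\rangle_w^2 \times \sum_{\substack{J \in \mathcal Q_2 \\ \pi_{\mathcal F} J = F}} \hat g(J)^2 \Bigr]^{1/2} \\
&\overset{(4.18)}{\le} \eta \, \alpha_f(F) \Bigl[ \sum_{\substack{S \in \mathcal S \\ \pi_{\mathcal F} S = F}} \sigma(S) \times \sum_{\substack{J \in \mathcal Q_2 \\ \pi_{\mathcal F} J = F}} \hat g(J)^2 \Bigr]^{1/2} \\
&\le \eta \, \alpha_f(F) \sigma(F)^{1/2} \Bigl[ \sum_{J \in \mathcal Q_2 \,:\, \pi_{\mathcal F} J = F} \hat g(J)^2 \Bigr]^{1/2} .
\end{aligned}
$$

The top line follows from (4.19). In the second, we appeal to (4.20) and monotonicity (2.3), the 
latter being available to us since $J \subset S$ implies $J \Subset S$, by hypothesis. We also take advantage of 
the strong assumptions on the intervals in $\mathcal Q_2$. If $J \in \mathcal Q_2$, we must have $\pi_{\mathcal F} J = \pi_{\mathcal F}(\pi_{\mathcal S} J)$. The 
third line is Cauchy-Schwarz, followed by the appeal to the hypothesis (4.18), while the last line 
uses the fact that the intervals in $\mathcal S$ are pairwise disjoint. 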

The quasi-orthogonality argument (3.10) completes the proof, namely we have 

$$ \sum_{F \in \mathcal F} \Xi(F) \lesssim \eta \, \|f\|_\sigma \|g\|_w . \tag{4.21} $$

□ 



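4.22. Lemma. Let $\mathcal S$ be a collection of pairwise disjoint intervals in $I_0$. Let $\mathcal Q$ be admissible such 
that for each $Q \in \mathcal Q$, there is an $S \in \mathcal S$ with $Q_2 \subset S \Subset \widetilde Q_1$. Then, there holds 

$$ |B_{\mathcal Q}(f,g)| \lesssim \eta \, \|f\|_\sigma \|g\|_w , \qquad \text{where} \quad \eta^2 := \sup_{S \in \mathcal S} \frac{P(\sigma(I_0 - S), S)^2}{\sigma(S)\,|S|^2} \sum_{J \in \mathcal Q_2 \,:\, J \subset S} \langle x, h^w_J \rangle_w^2 . \tag{4.23} $$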


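Proof. Construct stopping data $\mathcal F$ and $\alpha_f(\cdot)$ as in the proof of Lemma 4.17. The fundamental 
inequality (4.20) is again used. Then, by the monotonicity principle (2.3), there holds for $F \in \mathcal F$, 

$$
\begin{aligned}
\Bigl| \sum_{Q \in \mathcal Q \,:\, \pi_{\mathcal F} Q_2 = F} & E^{\sigma}_{Q_2} \Delta^{\sigma}_{Q_1} f \cdot \langle H_\sigma (I_0 - \widetilde Q_1), \Delta^w_{Q_2} g \rangle_w \Bigr| \\
&\lesssim \alpha_f(F) \sum_{S \in \mathcal S \,:\, \pi_{\mathcal F} S = F} P(\sigma(I_0 - S), S) \sum_{J \in \mathcal Q_2 \,:\, J \subset S} \Bigl| \Bigl\langle \frac{x}{|S|}, \Delta^w_J g \Bigr\rangle_w \Bigr| \\
&\le \alpha_f(F) \sum_{S \in \mathcal S \,:\, \pi_{\mathcal F} S = F} \frac{P(\sigma(I_0 - S), S)}{|S|} \Bigl[ \sum_{J \in \mathcal Q_2 \,:\, J \subset S} \langle x, h^w_J \rangle_w^2 \Bigr]^{1/2} \Bigl[ \sum_{J \in \mathcal Q_2 \,:\, J \subset S} \hat g(J)^2 \Bigr]^{1/2} \\
&\le \eta \, \alpha_f(F) \Bigl[ \sum_{S \in \mathcal S \,:\, \pi_{\mathcal F} S = F} \sigma(S) \times \sum_{J \in \mathcal Q_2 \,:\, \pi_{\mathcal F} J = F} \hat g(J)^2 \Bigr]^{1/2} \\
&\le \eta \, \alpha_f(F) \sigma(F)^{1/2} \Bigl[ \sum_{J \in \mathcal Q_2 \,:\, \pi_{\mathcal F} J = F} \hat g(J)^2 \Bigr]^{1/2} .
\end{aligned}
$$

After the monotonicity principle (2.3), we have used Cauchy-Schwarz, and the definition of $\eta$. 
The quasi-orthogonality argument (3.9) then completes the analysis of this term, see (4.21). □ 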

The last Lemma that we need is elementary, and is contained in the methods of [18]. 

4.24. Lemma. Let $u \ge r + 1$ be an integer, and $\mathcal Q$ be an admissible collection of pairs such that 
$|Q_1| = 2^{u} |Q_2|$ for all $Q \in \mathcal Q$. There holds 

$$ |B_{\mathcal Q}(f,g)| \lesssim \operatorname{size}(\mathcal Q) \, \|f\|_\sigma \|g\|_w . $$
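Proof. Recall the form of the stopping form in (4.1). Observe, from inspection of the definition 
of the Haar function (2.1), that 

$$ \bigl| E^{\sigma}_{I_J} \Delta^{\sigma}_{I} f \bigr| \le \frac{|\hat f(I)|}{\sigma(I_J)^{1/2}} . $$

Then, we have, keeping in mind that $I_J$ is one or the other of the two children of $I$, 

$$ |B_{\mathcal Q}(f,g)| \le \sum_{I \in \mathcal Q_1} \ \sum_{J \,:\, (I,J) \in \mathcal Q} \frac{|\hat f(I)|}{\sigma(I_J)^{1/2}} \, P(\sigma(I_0 - I_J), J) \Bigl| \Bigl\langle \frac{x}{|J|}, h^w_J \Bigr\rangle_w \Bigr| \, |\hat g(J)| \lesssim \operatorname{size}(\mathcal Q) \, \|f\|_\sigma \|g\|_w . $$

This follows immediately from Cauchy-Schwarz, and the fact that for each $J \in \mathcal Q_2$, there is a 
unique $I \in \mathcal Q_1$ such that the pair $(I, J)$ contributes to the sum above. □ 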